23 research outputs found

    A generalized framework to predict continuous scores from medical ordinal labels

    Many variables of interest in clinical medicine, like disease severity, are recorded using discrete ordinal categories such as normal/mild/moderate/severe. These labels are used to train and evaluate disease severity prediction models. However, ordinal categories are a simplification of an underlying continuous severity spectrum, and continuous scores are more sensitive than ordinal categories to small changes in disease severity over time. Here, we present a generalized framework that accurately predicts continuously valued variables using only discrete ordinal labels during model development. We found that for three clinical prediction tasks, models that take the ordinal relationship of the training labels into account outperformed conventional multi-class classification models. In particular, the continuous scores generated by the ordinal classification and regression models showed significantly higher correlation with expert rankings of disease severity and lower mean squared error than the multi-class classification models. Furthermore, the use of MC dropout significantly improved the ability of all evaluated deep learning approaches to predict continuously valued scores that truthfully reflect the underlying continuous target variable. We showed that accurate continuously valued predictions can be generated even if model development involves only discrete ordinal labels. The novel framework has been validated on three different clinical prediction tasks and has proven to bridge the gap between discrete ordinal labels and the underlying continuously valued variables.
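
    One common way to obtain a continuous score from an ordinal classifier is to decode the class probabilities as a probability-weighted expectation, averaging several stochastic forward passes first (as with MC dropout). The abstract does not specify the decoding rule, so this is a minimal pure-Python sketch under that assumption; all names and numbers are illustrative.

```python
def continuous_severity(prob_row):
    """Decode ordinal class probabilities (grade 0 = normal .. K-1 =
    severe) into a continuous score via the probability-weighted
    expectation over grade indices."""
    return sum(grade * p for grade, p in enumerate(prob_row))

def mc_dropout_average(prob_rows):
    """Average the softmax outputs of several stochastic forward passes
    (dropout left active at inference); the averaged distribution
    yields finer-grained continuous scores than a single pass."""
    n = len(prob_rows)
    return [sum(row[k] for row in prob_rows) / n
            for k in range(len(prob_rows[0]))]

# Two stochastic passes over the same (hypothetical) image:
passes = [[0.70, 0.20, 0.10, 0.00],
          [0.50, 0.30, 0.20, 0.00]]
score = continuous_severity(mc_dropout_average(passes))  # ≈ 0.55
```

    Because the expectation can fall between grade indices, the decoded score varies smoothly even though the training labels were discrete.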

    Characterization of Errors in Retinopathy of Prematurity Diagnosis by Ophthalmologists-in-Training in the United States and Canada

    PURPOSE: To identify the prominent factors that lead to misdiagnosis of retinopathy of prematurity (ROP) by ophthalmologists-in-training in the United States and Canada. METHODS: This prospective cohort study included 32 ophthalmologists-in-training at six ophthalmology training programs in the United States and Canada. Twenty web-based cases of ROP using wide-field retinal images were presented, and ophthalmologists-in-training were asked to diagnose plus disease, zone, stage, and category for each eye. Responses were compared for accuracy to a consensus reference standard diagnosis, which was established by combining the clinical diagnosis and the image-based diagnosis by multiple experts. The types of diagnostic errors that occurred were analyzed with descriptive and chi-squared analysis. Main outcome measures were frequency of types (category, zone, stage, plus disease) of diagnostic errors; association of errors in zone, stage, and plus disease diagnosis with incorrectly identified category; and performance of ophthalmologists-in-training across postgraduate years. RESULTS: Category of ROP was misdiagnosed at a rate of 48%. Errors in classification of plus disease were most commonly associated with misdiagnosis of treatment-requiring ROP (plus error rate = 16% when treatment-requiring was correctly diagnosed vs 81% when underdiagnosed as type 2 or pre-plus; mean difference: 64.3; 95% CI: 51.9 to 76.7). CONCLUSIONS: Ophthalmologists-in-training in the United States and Canada misdiagnosed ROP nearly half of the time, with incorrect identification of plus disease as a leading cause. Integration of structured learning for ROP in residency education may improve diagnostic competency.

    Fully automated disease severity assessment and treatment monitoring in retinopathy of prematurity using deep learning

    Retinopathy of prematurity (ROP) is a disease that affects premature infants, where abnormal growth of the retinal blood vessels can lead to blindness unless treated accordingly. Infants considered at risk of severe ROP are monitored for symptoms of plus disease, characterized by arterial tortuosity and venous dilation at the posterior pole, with a standard photographic definition. Disagreement among ROP experts in diagnosing plus disease has driven the development of computer-based methods that classify images based on hand-crafted features extracted from the vasculature. However, most of these approaches are semi-automated, which makes them time-consuming and subject to variability. In contrast, deep learning is a fully automated approach that has shown great promise in a wide variety of domains, including medical genetics, informatics and imaging. Convolutional neural networks (CNNs) are deep networks which learn rich representations of disease features that are highly robust to variations in acquisition and image quality. In this study, we utilized a U-Net architecture to perform vessel segmentation and then a GoogLeNet to perform disease classification. The classifier was trained on 3,000 retinal images and validated on an independent test set of patients with different observed progressions and treatments. We show that our fully automated algorithm can be used to monitor the progression of plus disease over multiple patient visits with results that are consistent with the experts’ consensus diagnosis. Future work will aim to further validate the method on larger cohorts of patients to assess its applicability within the clinic as a treatment monitoring tool.

    Development and international validation of custom-engineered and code-free deep-learning models for detection of plus disease in retinopathy of prematurity: a retrospective study

    BACKGROUND: Retinopathy of prematurity (ROP), a leading cause of childhood blindness, is diagnosed through interval screening by paediatric ophthalmologists. However, improved survival of premature neonates coupled with a scarcity of available experts has raised concerns about the sustainability of this approach. We aimed to develop bespoke and code-free deep learning-based classifiers for plus disease, a hallmark of ROP, in an ethnically diverse population in London, UK, and externally validate them in ethnically, geographically, and socioeconomically diverse populations in four countries and three continents. Code-free deep learning is not reliant on the availability of expertly trained data scientists, thus being of particular potential benefit for low resource health-care settings. METHODS: This retrospective cohort study used retinal images from 1370 neonates admitted to a neonatal unit at Homerton University Hospital NHS Foundation Trust, London, UK, between 2008 and 2018. Images were acquired using a Retcam Version 2 device (Natus Medical, Pleasanton, CA, USA) on all babies who were either born at less than 32 weeks gestational age or had a birthweight of less than 1501 g. Each image was graded by two junior ophthalmologists with disagreements adjudicated by a senior paediatric ophthalmologist. Bespoke and code-free deep learning (CFDL) models were developed for the discrimination of healthy, pre-plus disease, and plus disease. Performance was assessed internally on 200 images with the majority vote of three senior paediatric ophthalmologists as the reference standard. External validation was performed on 338 retinal images from four separate datasets from the USA, Brazil, and Egypt, with images derived from Retcam and the 3nethra neo device (Forus Health, Bengaluru, India). FINDINGS: Of the 7414 retinal images in the original dataset, 6141 images were used in the final development dataset.
For the discrimination of healthy versus pre-plus or plus disease, the bespoke model had an area under the curve (AUC) of 0·986 (95% CI 0·973-0·996) and the CFDL model had an AUC of 0·989 (0·979-0·997) on the internal test set. Both models generalised well to external validation test sets acquired using the Retcam for discriminating healthy from pre-plus or plus disease (bespoke range was 0·975-1·000 and CFDL range was 0·969-0·995). The CFDL model was inferior to the bespoke model on discriminating pre-plus disease from healthy or plus disease in the USA dataset (CFDL 0·808 [95% CI 0·671-0·909] vs bespoke 0·942 [0·892-0·982]; p=0·0070). Performance was also reduced when tested on the 3nethra neo imaging device (CFDL 0·865 [0·742-0·965] and bespoke 0·891 [0·783-0·977]). INTERPRETATION: Both bespoke and CFDL models conferred similar performance to senior paediatric ophthalmologists for discriminating healthy retinal images from ones with features of pre-plus or plus disease; however, CFDL models might generalise less well when considering minority classes. Care should be taken when testing on data acquired using an imaging device different from the one used for the development dataset. Our study justifies further validation of plus disease classifiers in ROP screening and supports a potential role for code-free approaches to help prevent blindness in vulnerable neonates.
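
The AUC figures reported in these abstracts have a simple rank interpretation: the probability that a randomly chosen diseased image receives a higher model score than a randomly chosen healthy one, counting ties as one half. A minimal pure-Python sketch (toy scores; function and variable names are illustrative, not from the paper):

```python
def auc(scores_pos, scores_neg):
    """Empirical AUC: probability that a positive case (e.g. plus
    disease) outscores a negative (healthy) case, ties counted 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Perfect separation gives 1.0; a constant score gives 0.5.
print(auc([0.9, 0.8], [0.2, 0.1]))  # 1.0
```

This pairwise form is equivalent to the normalized Mann-Whitney U statistic, which is why AUC is insensitive to monotone rescaling of the model's scores.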

    Classification and comparison via neural networks

    We consider learning from comparison labels generated as follows: given two samples in a dataset, a labeler produces a label indicating their relative order. Such comparison labels scale quadratically with the dataset size; most importantly, in practice, they often exhibit lower variance compared to class labels. We propose a new neural network architecture based on Siamese networks to incorporate both class and comparison labels in the same training pipeline, using Bradley–Terry and Thurstone loss functions. Our architecture leads to a significant improvement in predicting both class and comparison labels, increasing classification AUC by as much as 35% and comparison AUC by as much as 6% on several real-life datasets. We further show that, by incorporating comparisons, training from few samples becomes possible: a deep neural network of 5.9 million parameters trained on 80 images attains a 0.92 AUC when incorporating comparisons.
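
    The Bradley–Terry term mentioned above can be sketched in a few lines: the Siamese network scores both samples with shared weights, and the probability that one outranks the other is a logistic function of the score difference (the Thurstone variant replaces the logistic link with a Gaussian CDF). A minimal sketch with illustrative names:

```python
import math

def bradley_terry_loss(score_i, score_j, label):
    """Negative log-likelihood of a pairwise comparison under the
    Bradley–Terry model: P(i beats j) = sigmoid(score_i - score_j).
    `label` is 1 if the labeler ranked sample i above sample j, else 0."""
    p = 1.0 / (1.0 + math.exp(-(score_i - score_j)))
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

# A Siamese network applies the same scoring branch to both samples
# and trains the shared weights on this loss alongside the usual
# classification loss on class-labeled samples.
loss_agree = bradley_terry_loss(2.0, 0.0, 1)     # small: model agrees
loss_conflict = bradley_terry_loss(0.0, 2.0, 1)  # large: model disagrees
```

    Because the loss depends only on the score difference, gradients from comparison pairs and from ordinary class labels can update the same backbone.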

    Automated Diagnosis of Plus Disease in Retinopathy of Prematurity Using Deep Convolutional Neural Networks

    Importance Retinopathy of prematurity (ROP) is a leading cause of childhood blindness worldwide. The decision to treat is primarily based on the presence of plus disease, defined as dilation and tortuosity of retinal vessels. However, clinical diagnosis of plus disease is highly subjective and variable. Objective To implement and validate an algorithm based on deep learning to automatically diagnose plus disease from retinal photographs. Design, Setting, and Participants A deep convolutional neural network was trained using a data set of 5511 retinal photographs. Each image was previously assigned a reference standard diagnosis (RSD) based on consensus of image grading by 3 experts and clinical diagnosis by 1 expert (ie, normal, pre–plus disease, or plus disease). The algorithm was evaluated by 5-fold cross-validation and tested on an independent set of 100 images. Images were collected from 8 academic institutions participating in the Imaging and Informatics in ROP (i-ROP) cohort study. The deep learning algorithm was tested against 8 ROP experts, each of whom had more than 10 years of clinical experience and more than 5 peer-reviewed publications about ROP. Data were collected from July 2011 to December 2016. Data were analyzed from December 2016 to September 2017. Exposures A deep learning algorithm trained on retinal photographs. Main Outcomes and Measures Receiver operating characteristic analysis was performed to evaluate performance of the algorithm against the RSD. Quadratic-weighted κ coefficients were calculated for ternary classification (ie, normal, pre–plus disease, and plus disease) to measure agreement with the RSD and 8 independent experts. Results Of the 5511 included retinal photographs, 4535 (82.3%) were graded as normal, 805 (14.6%) as pre–plus disease, and 172 (3.1%) as plus disease, based on the RSD. 
Mean (SD) area under the receiver operating characteristic curve statistics were 0.94 (0.01) for the diagnosis of normal (vs pre–plus disease or plus disease) and 0.98 (0.01) for the diagnosis of plus disease (vs normal or pre–plus disease). For diagnosis of plus disease in an independent test set of 100 retinal images, the algorithm achieved a sensitivity of 93% with 94% specificity. For detection of pre–plus disease or worse, the sensitivity and specificity were 100% and 94%, respectively. On the same test set, the algorithm achieved a quadratic-weighted κ coefficient of 0.92 compared with the RSD, outperforming 6 of 8 ROP experts. Conclusions and Relevance This fully automated algorithm diagnosed plus disease in ROP with comparable or better accuracy than human experts. This has potential applications in disease detection, monitoring, and prognosis in infants at risk of ROP.
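
The quadratic-weighted κ used for the ternary agreement above penalizes disagreements by the squared distance between ordinal categories, so confusing normal with plus disease costs more than confusing adjacent grades. A minimal pure-Python sketch (toy inputs; names are illustrative):

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_classes=3):
    """Quadratic-weighted Cohen's kappa for ordinal agreement
    (e.g. 0 = normal, 1 = pre-plus, 2 = plus disease)."""
    n = len(rater_a)
    # Observed joint distribution of the two raters' grades.
    obs = [[0.0] * n_classes for _ in range(n_classes)]
    for a, b in zip(rater_a, rater_b):
        obs[a][b] += 1.0 / n
    # Marginal distributions, used for chance-expected disagreement.
    pa = [sum(1 for a in rater_a if a == k) / n for k in range(n_classes)]
    pb = [sum(1 for b in rater_b if b == k) / n for k in range(n_classes)]
    w = lambda i, j: (i - j) ** 2 / (n_classes - 1) ** 2
    num = sum(w(i, j) * obs[i][j]
              for i in range(n_classes) for j in range(n_classes))
    den = sum(w(i, j) * pa[i] * pb[j]
              for i in range(n_classes) for j in range(n_classes))
    return 1.0 - num / den

# Perfect agreement -> kappa = 1.0
print(quadratic_weighted_kappa([0, 1, 2, 2], [0, 1, 2, 2]))  # 1.0
```

A value of 0.92 against the reference standard therefore indicates near-perfect ordinal agreement, with any residual disagreement concentrated in adjacent grades.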

    Evaluation of a deep learning image assessment system for detecting severe retinopathy of prematurity

    Background Prior work has demonstrated the near-perfect accuracy of a deep learning retinal image analysis system for diagnosing plus disease in retinopathy of prematurity (ROP). Here we assess the screening potential of this scoring system by determining its ability to detect all components of ROP diagnosis. Methods Clinical examination and fundus photography were performed at seven participating centres. A deep learning system was trained to detect plus disease, generating a quantitative assessment of retinal vascular abnormality (the i-ROP plus score) on a 1–9 scale. Overall ROP disease category was established using a consensus reference standard diagnosis combining clinical and image-based diagnosis. Experts then rank ordered a second data set of 100 posterior images according to overall ROP severity. Results 4861 examinations from 870 infants were analysed. 155 examinations (3%) had a reference standard diagnosis of type 1 ROP. The i-ROP deep learning (DL) vascular severity score had an area under the receiver operating characteristic curve of 0.960 for detecting type 1 ROP. Establishing a threshold i-ROP DL score of 3 conferred 94% sensitivity, 79% specificity, 13% positive predictive value and 99.7% negative predictive value for type 1 ROP. There was strong correlation between expert rank ordering of overall ROP severity and the i-ROP DL vascular severity score (Spearman correlation coefficient=0.93; p<0.0001). Conclusion The i-ROP DL system accurately identifies diagnostic categories and overall disease severity in an automated fashion, after being trained only on posterior pole vascular morphology. These data provide proof of concept that a deep learning screening platform could improve objectivity of ROP diagnosis and accessibility of screening.
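
    Threshold-based screening metrics like the ones quoted above follow directly from the confusion counts at the chosen cutoff. A minimal pure-Python sketch; whether the threshold is inclusive is an assumption, and the scores and labels here are made up for illustration:

```python
def screening_metrics(scores, has_disease, threshold):
    """Sensitivity, specificity, PPV and NPV when every image whose
    severity score meets the threshold is flagged for referral."""
    tp = sum(1 for s, d in zip(scores, has_disease) if s >= threshold and d)
    fp = sum(1 for s, d in zip(scores, has_disease) if s >= threshold and not d)
    fn = sum(1 for s, d in zip(scores, has_disease) if s < threshold and d)
    tn = sum(1 for s, d in zip(scores, has_disease) if s < threshold and not d)
    return {
        "sensitivity": tp / (tp + fn),  # diseased cases caught
        "specificity": tn / (tn + fp),  # healthy cases passed through
        "ppv": tp / (tp + fp),          # flagged cases truly diseased
        "npv": tn / (tn + fn),          # unflagged cases truly healthy
    }

# Toy severity scores on a 1-9 scale, flagged at a threshold of 3:
m = screening_metrics([1, 2, 3, 5, 9],
                      [False, False, False, True, True],
                      threshold=3)
```

    Note how a low threshold trades specificity and PPV for high sensitivity and NPV, which is the usual design choice for a screening (rather than diagnostic) tool.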

    Panretinal handheld OCT angiography for pediatric retinal imaging

    Comprehensive visualization of retinal morphology is essential in the diagnosis and management of retinal diseases in pediatric populations. Conventional imaging techniques often face challenges in effectively capturing the peripheral retina, primarily due to limitations in current optical designs, which lack the field of view necessary to characterize the far periphery. To address this gap, our study introduces a novel ultra-widefield optical coherence tomography angiography (OCTA) system. This system, specifically tailored for pediatric applications, incorporates an ultrahigh-speed 800 kHz swept-source laser. The system’s innovative design achieves a 140° field of view while maintaining excellent optical performance. Over the last 15 months, we have conducted 379 eye examinations on 96 babies using this system. It demonstrates marked efficacy in the diagnosis of retinopathy of prematurity, providing detailed and comprehensive peripheral retinal angiography. The capabilities of the ultra-widefield handheld OCTA system in enhancing the clarity and thoroughness of retinal vascularization assessments have significantly improved the precision of diagnoses and the customization of treatment strategies. Our findings underscore the system’s potential to advance pediatric ophthalmology and broaden the scope of retinal imaging.

    Applications of Artificial Intelligence for Retinopathy of Prematurity Screening

    OBJECTIVES: Childhood blindness from retinopathy of prematurity (ROP) is increasing as a result of improvements in neonatal care worldwide. We evaluate the effectiveness of artificial intelligence (AI)–based screening in an Indian ROP telemedicine program and whether differences in ROP severity between neonatal care units (NCUs) identified by using AI are related to differences in oxygen-titrating capability. METHODS: External validation study of an existing AI-based quantitative severity scale for ROP on a data set of images from the Retinopathy of Prematurity Eradication Save Our Sight ROP telemedicine program in India. All images were assigned an ROP severity score (1–9) by using the Imaging and Informatics in Retinopathy of Prematurity Deep Learning system. We calculated the area under the receiver operating characteristic curve and sensitivity and specificity for treatment-requiring retinopathy of prematurity. Using multivariable linear regression, we evaluated the mean and median ROP severity in each NCU as a function of mean birth weight, gestational age, and the presence of oxygen blenders and pulse oxygenation monitors. RESULTS: The area under the receiver operating characteristic curve for detection of treatment-requiring retinopathy of prematurity was 0.98, with 100% sensitivity and 78% specificity. We found higher median (interquartile range) ROP severity in NCUs without oxygen blenders and pulse oxygenation monitors, most apparent in bigger infants (>1500 g and >31 weeks’ gestation: 2.7 [2.5–3.0] vs 3.1 [2.4–3.8]; P = .007, with adjustment for birth weight and gestational age). CONCLUSIONS: Integration of AI into ROP screening programs may lead to improved access to care for secondary prevention of ROP and may facilitate assessment of disease epidemiology and NCU resources.